Smart Paper Notes

Using Taylor-Approximated Gradients to Improve the Frank-Wolfe Method for Empirical Risk Minimization

Zikai Xiong , Robert M. Freund

Categories: Machine Learning | Machine Learning (Statistics)

2022-08-30

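The Frank-Wolfe method has become increasingly useful in statistical and machine learning applications because of the structure-inducing properties of its iterates, especially in settings where linear minimization over the feasible set is computationally cheaper than projection. In empirical risk minimization, one of the fundamental optimization problems in statistics and machine learning, the computational cost of Frank-Wolfe methods typically grows linearly in the number of data observations $n$. This is in stark contrast to typical stochastic projection methods. To reduce this dependence on $n$, we exploit the second-order smoothness of typical smooth loss functions (for example, least-squares loss and logistic loss) and propose amending the Frank-Wolfe method with Taylor-series-approximated gradients, including variants for both the deterministic and the stochastic setting. Compared with current state-of-the-art methods in the regime where the optimality tolerance $\varepsilon$ is sufficiently small, our methods simultaneously reduce the dependence on large $n$ and attain the optimal convergence rates of Frank-Wolfe methods, in both the convex and the non-convex setting. We also propose a novel adaptive step-size approach for which we provide computational guarantees. Finally, we present computational experiments showing that our methods achieve very significant speed-ups over existing methods on real-world datasets for both convex and non-convex binary classification problems.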

translated by Google Translate
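For readers unfamiliar with the method, the following is a minimal sketch of the classical (full-gradient) Frank-Wolfe iteration, not the Taylor-approximated variant proposed in the paper. It assumes a least-squares empirical risk over an l1 ball; the function names, the choice of constraint set, and the standard 2/(k+2) step size are illustrative assumptions. It shows why the per-iteration cost scales with $n$: every step evaluates the gradient over all $n$ observations.

```python
import numpy as np

def lmo_l1_ball(grad, radius):
    """Linear minimization oracle: argmin of <grad, s> over the l1 ball of given radius.

    The minimizer is a signed, scaled coordinate vector (a vertex of the ball).
    """
    i = int(np.argmax(np.abs(grad)))
    s = np.zeros_like(grad)
    s[i] = -radius * np.sign(grad[i])
    return s

def frank_wolfe_least_squares(A, b, radius, max_iter=500, tol=1e-6):
    """Classical Frank-Wolfe for min (1/2n)||Ax - b||^2 subject to ||x||_1 <= radius.

    Returns the final iterate and the last Frank-Wolfe gap, which upper-bounds
    the suboptimality for convex objectives.
    """
    n, d = A.shape
    x = np.zeros(d)  # feasible starting point
    gap = np.inf
    for k in range(max_iter):
        grad = A.T @ (A @ x - b) / n      # full gradient: touches all n rows
        s = lmo_l1_ball(grad, radius)     # cheap linear subproblem, no projection
        gap = float(grad @ (x - s))       # Frank-Wolfe gap at the current iterate
        if gap <= tol:
            break
        gamma = 2.0 / (k + 2.0)           # standard open-loop step size
        x = x + gamma * (s - x)           # convex combination keeps x feasible
    return x, gap
```

Because each iterate is a convex combination of vertices of the l1 ball, the iterates stay feasible without any projection step, and after k steps the iterate has at most k nonzero coordinates.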

Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization

Geoffrey Négiar , Gideon Dresdner , Alicia Tsai , Laurent El Ghaoui , Francesco Locatello , Robert M. Freund , Fabian Pedregosa

Categories: Machine Learning

2020-02-27

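We propose a novel stochastic Frank-Wolfe (a.k.a. conditional gradient) algorithm for constrained smooth finite-sum minimization with a generalized linear prediction/structure. This class of problems includes empirical risk minimization with sparse, low-rank, or other structured constraints. The proposed method is simple to implement, does not require step-size tuning, and has a constant per-iteration cost that is independent of the dataset size. Furthermore, as a byproduct of the method, we obtain a stochastic estimator of the Frank-Wolfe gap that can be used as a stopping criterion. Depending on the setting, the proposed method matches or improves on the best computational guarantees for stochastic Frank-Wolfe algorithms. Benchmarks on several datasets highlight different regimes in which the proposed method exhibits faster empirical convergence than related methods. Finally, we provide an implementation of all considered methods in an open-source package.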

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

Transformer-based normative modelling for anomaly detection of early schizophrenia

Pedro F Da Costa , Jessica Dafflon , Sergio Leonardo Mendes , João Ricardo Sato , M. Jorge Cardoso , Robert Leech , Emily JH Jones , Walter H. L. Pinaya

Categories: Machine Learning | Artificial Intelligence

2022-12-08

Despite the impact of psychiatric disorders on clinical health, early-stage diagnosis remains a challenge. Machine learning studies have shown that classifiers tend to be overly narrow in the diagnosis prediction task. The overlap between conditions leads to high heterogeneity among participants that is not adequately captured by classification models. To address this issue, normative approaches have surged as an alternative method. By using a generative model to learn the distribution of healthy brain data patterns, we can identify the presence of pathologies as deviations or outliers from the distribution learned by the model. In particular, deep generative models showed great results as normative models to identify neurological lesions in the brain. However, unlike most neurological lesions, psychiatric disorders present subtle changes widespread in several brain regions, making these alterations challenging to identify. In this work, we evaluate the performance of transformer-based normative models to detect subtle brain changes expressed in adolescents and young adults. We trained our model on 3D MRI scans of neurotypical individuals (N=1,765). Then, we obtained the likelihood of neurotypical controls and psychiatric patients with early-stage schizophrenia from an independent dataset (N=93) from the Human Connectome Project. Using the predicted likelihood of the scans as a proxy for a normative score, we obtained an AUROC of 0.82 when assessing the difference between controls and individuals with early-stage schizophrenia. Our approach surpassed recent normative methods based on brain age and Gaussian Process, showing the promising use of deep generative models to help in individualised analyses.

Deep Learning Generates Synthetic Cancer Histology for Explainability and Education

James M. Dolezal , Rachelle Wolk , Hanna M. Hieromnimon , Frederick M. Howard , Andrew Srisuwananukorn , Dmitry Karpeyev , Siddhi Ramesh , Sara Kochanny , Jung Woo Kwon , Meghana Agni

Categories: Computer Vision

2022-11-12

Artificial intelligence methods including deep neural networks (DNN) can provide rapid molecular classification of tumors from routine histology with accuracy that matches or exceeds human pathologists. Discerning how neural networks make their predictions remains a significant challenge, but explainability tools help provide insights into what models have learned when corresponding histologic features are poorly defined. Here, we present a method for improving explainability of DNN models using synthetic histology generated by a conditional generative adversarial network (cGAN). We show that cGANs generate high-quality synthetic histology images that can be leveraged for explaining DNN models trained to classify molecularly-subtyped tumors, exposing histologic features associated with molecular state. Fine-tuning synthetic histology through class and layer blending illustrates nuanced morphologic differences between tumor subtypes. Finally, we demonstrate the use of synthetic histology for augmenting pathologist-in-training education, showing that these intuitive visualizations can reinforce and improve understanding of histologic manifestations of tumor biology.

Combining Multi-Fidelity Modelling and Asynchronous Batch Bayesian Optimization

Jose Pablo Folch , Robert M Lee , Behrang Shafei , David Walz , Calvin Tsay , Mark van der Wilk , Ruth Misener

Categories: Machine Learning | Machine Learning (Statistics)

2022-11-11

Bayesian Optimization is a useful tool for experiment design. Unfortunately, the classical, sequential setting of Bayesian Optimization does not translate well into laboratory experiments, for instance battery design, where measurements may come from different sources and their evaluations may require significant waiting times. Multi-fidelity Bayesian Optimization addresses the setting with measurements from different sources. Asynchronous batch Bayesian Optimization provides a framework to select new experiments before the results of the prior experiments are revealed. This paper proposes an algorithm combining multi-fidelity and asynchronous batch methods. We empirically study the algorithm behavior, and show it can outperform single-fidelity batch methods and multi-fidelity sequential methods. As an application, we consider designing electrode materials for optimal performance in pouch cells using experiments with coin cells to approximate battery performance.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Categories: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Automated Quality Controlled Analysis of 2D Phase Contrast Cardiovascular Magnetic Resonance Imaging

Emily Chan , Ciaran O'Hanlon , Carlota Asegurado Marquez , Marwenie Petalcorin , Jorge Mariscal-Harana , Haotian Gu , Raymond J. Kim , Robert M. Judd , Phil Chowienczyk , Julia A. Schnabel

Categories: Computer Vision

2022-09-28

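Flow analysis carried out using phase contrast cardiac magnetic resonance imaging (PC-CMR) enables the quantification of important parameters used in the assessment of cardiovascular function. An essential part of this analysis is the identification of the correct CMR views and quality control (QC) to detect artefacts that could affect the flow quantification. We propose a novel deep-learning-based framework for the fully automated analysis of flow from full CMR scans. It first carries out the view selection and QC steps using two sequential convolutional neural networks, then performs automatic aorta and pulmonary artery segmentation to enable the quantification of key flow parameters. Accuracy values of 0.958 and 0.914 were obtained for view classification and QC, respectively. For segmentation, Dice scores were $>$0.969, and the Bland-Altman plots indicated excellent agreement between manual and automatic peak flow values. In addition, we tested the pipeline on an external validation dataset, with results indicating that the pipeline is robust. This work was carried out using multivendor clinical data consisting of 986 cases, indicating the potential for use of this pipeline in a clinical setting.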

Light curve completion and forecasting using fast and scalable Gaussian processes (MuyGPs)

Imène R. Goumiri , Alec M. Dunton , Amanda L. Muyskens , Benjamin W. Priest , Robert E. Armstrong

Categories: Machine Learning (Statistics)

2022-08-31

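Temporal variations of apparent magnitude, called light curves, are observational statistics of interest captured by telescopes over long periods of time. Light curves allow Space Domain Awareness (SDA) objectives such as object identification or pose estimation to be explored as latent variable inference problems. Ground-based observations from commercial off-the-shelf (COTS) cameras remain inexpensive compared with higher-precision instruments; however, limited sensor availability combined with noisier observations can produce gappy time-series data that are difficult to model. These external factors confound the automated exploitation of light curves, which makes light curve prediction and extrapolation a crucial problem for applications. Traditionally, image or time-series completion problems have been approached with diffusion-based or exemplar-based methods. More recently, deep neural networks (DNNs) have become the tool of choice because of their empirical success at learning complex nonlinear embeddings. However, DNNs often require large amounts of training data, which are not necessarily available when looking at the unique features of the light curve of a single satellite. In this paper, we present a novel approach to predicting missing and future data points of light curves using Gaussian processes (GPs). GPs are nonlinear probabilistic models that infer posterior distributions over functions and naturally quantify uncertainty. However, the cubic scaling of GP inference and training is a major barrier to their adoption in applications. In particular, a single light curve can contain hundreds of thousands of observations, which is well beyond the practical limits of a conventional GP on a single machine. Consequently, we employ MuyGPs, a scalable framework for hyperparameter estimation of GP models that uses nearest-neighbor sparsification and local cross-validation. MuyGPs ...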

Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization

Tan Nguyen , Richard G. Baraniuk , Robert M. Kirby , Stanley J. Osher , Bao Wang

Categories: Machine Learning

2022-08-01

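Transformers have achieved remarkable success in sequence modeling and beyond, but suffer from computational and memory complexity that is quadratic in the length of the input sequence. Efficient transformers, which leverage techniques such as sparse attention, linear attention, and hashing tricks, have been proposed to reduce this quadratic complexity, but they significantly degrade accuracy. In response, we first interpret the linear attention and the residual connections used in computing the attention map as gradient descent steps. We then introduce momentum into these components and propose the momentum transformer, which uses momentum to improve the accuracy of linear transformers while maintaining linear memory and computational complexity. Furthermore, we develop an adaptive strategy that computes the momentum value for our model based on the optimal momentum for quadratic optimization. This adaptive momentum eliminates the need to search for the optimal momentum value and further enhances the performance of the momentum transformer. A range of experiments on both autoregressive and non-autoregressive tasks, including image generation and machine translation, demonstrate that the momentum transformer outperforms popular linear transformers in training efficiency and accuracy.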
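As background for the phrase "optimal momentum for quadratic optimization", the sketch below illustrates the classical heavy-ball (Polyak momentum) iteration on a strongly convex quadratic, using the textbook optimal step size and momentum computed from the extreme eigenvalues. This is not the paper's adaptive scheme or its attention architecture; the function names and the comparison against plain gradient descent are illustrative assumptions.

```python
import numpy as np

def heavy_ball_quadratic(Q, c, num_iter=200):
    """Heavy-ball iteration for min 0.5 x^T Q x - c^T x, with Q symmetric positive definite.

    Uses the classical optimal parameters for quadratics:
      step = 4 / (sqrt(L) + sqrt(mu))^2
      beta = ((sqrt(L) - sqrt(mu)) / (sqrt(L) + sqrt(mu)))^2
    where mu and L are the smallest and largest eigenvalues of Q.
    """
    eigs = np.linalg.eigvalsh(Q)          # eigenvalues in ascending order
    mu, L = eigs[0], eigs[-1]
    step = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
    beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2
    x_prev = np.zeros_like(c)
    x = np.zeros_like(c)
    for _ in range(num_iter):
        grad = Q @ x - c
        x_next = x - step * grad + beta * (x - x_prev)  # gradient step plus momentum term
        x_prev, x = x, x_next
    return x

def gradient_descent_quadratic(Q, c, num_iter=200):
    """Plain gradient descent with step 1/L on the same quadratic, for comparison."""
    L = np.linalg.eigvalsh(Q)[-1]
    x = np.zeros_like(c)
    for _ in range(num_iter):
        x = x - (Q @ x - c) / L
    return x
```

On an ill-conditioned quadratic, the momentum iteration contracts at a rate governed by the square root of the condition number, while plain gradient descent depends on the condition number itself, which is the classical motivation for adding momentum.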
